Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 214679 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 100.7 MiB |
| Average record size in memory | 491.9 B |
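The average record size follows from the two figures above it. A minimal sketch of that arithmetic, assuming the reported total is in binary mebibytes (1 MiB = 1024² bytes); the function and variable names are illustrative and not part of the report:

```python
# Figures copied from the "Dataset statistics" table.
total_size_mib = 100.7      # "Total size in memory" (already rounded in the report)
n_observations = 214_679    # "Number of observations"

BYTES_PER_MIB = 1024 ** 2


def average_record_size(total_mib: float, rows: int) -> float:
    """Return the mean in-memory size of one row, in bytes."""
    return total_mib * BYTES_PER_MIB / rows


avg_bytes = average_record_size(total_size_mib, n_observations)
print(f"{avg_bytes:.1f} B")
```

Because the total is itself rounded to one decimal place, the result only agrees with the reported 491.9 B to about a tenth of a byte.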
Variable types
| Numeric | 12 |
|---|---|
| DateTime | 1 |
| Categorical | 7 |
Warnings
| Company has constant value "Yellow Cab" | Constant |
|---|---|
| df_index is highly correlated with Transaction ID and 1 other field | High correlation |
| Transaction ID is highly correlated with df_index and 1 other field | High correlation |
| KM Travelled is highly correlated with Cost of Trip | High correlation |
| Cost of Trip is highly correlated with KM Travelled | High correlation |
| Population is highly correlated with Users | High correlation |
| Users is highly correlated with Population | High correlation |
| Year is highly correlated with df_index and 1 other field | High correlation |
| Company is highly correlated with Year and 5 other fields | High correlation |
| Year is highly correlated with Company | High correlation |
| City is highly correlated with Company | High correlation |
| Holiday is highly correlated with Company | High correlation |
| Day of Week is highly correlated with Company | High correlation |
| Payment_Mode is highly correlated with Company | High correlation |
| Gender is highly correlated with Company | High correlation |
| df_index has unique values | Unique |
| Transaction ID has unique values | Unique |
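The report does not say which coefficient or cutoff triggers a "High correlation" warning. As a rough sketch of the idea, the block below computes Pearson's r for a KM Travelled / Cost of Trip style pair; the choice of Pearson's r, the 0.9 threshold, and the five sample trips are all assumptions made for illustration, not values taken from the dataset.

```python
from math import sqrt
from typing import Sequence

HIGH_CORRELATION_THRESHOLD = 0.9  # assumption, not taken from the report


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        # A constant column (such as Company here) has no defined correlation.
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical trips: distance in km and a made-up cost that grows with it.
km_travelled = [1.9, 12.0, 22.47, 32.98, 48.0]
cost_of_trip = [25.1, 160.3, 290.0, 441.7, 640.2]

r = pearson_r(km_travelled, cost_of_trip)
is_highly_correlated = abs(r) >= HIGH_CORRELATION_THRESHOLD
print(f"r = {r:.4f}, high correlation: {is_highly_correlated}")
```

The explicit error for constant input is a design choice of this sketch: it makes the degenerate case visible instead of returning a misleading number.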
Reproduction
| Analysis started | 2021-02-27 19:44:25.389760 |
|---|---|
| Analysis finished | 2021-02-27 19:45:40.595617 |
| Duration | 1 minute and 15.21 seconds |
| Software version | pandas-profiling v2.10.1 |
| Download configuration | config.yaml |
df_index
Real number (ℝ≥0)
| Distinct | 214679 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 179235.1597 |
|---|---|
| Minimum | 0 |
| Maximum | 359391 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 17282.9 |
| Q1 | 90006.5 |
| median | 179319 |
| Q3 | 268392 |
| 95-th percentile | 341682.1 |
| Maximum | 359391 |
| Range | 359391 |
| Interquartile range (IQR) | 178385.5 |
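The range and interquartile range in the quantile table are plain differences of other rows. A small check of that arithmetic using the figures above (the helper name is illustrative):

```python
def spread_from_quantiles(minimum: float, q1: float, q3: float, maximum: float) -> dict:
    """Derive the range and interquartile range from four quantiles."""
    return {"range": maximum - minimum, "iqr": q3 - q1}


# Minimum, Q1, Q3 and Maximum copied from the quantile table.
spread = spread_from_quantiles(minimum=0, q1=90006.5, q3=268392, maximum=359391)
print(spread)
```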
Descriptive statistics
| Standard deviation | 103834.9944 |
|---|---|
| Coefficient of variation (CV) | 0.5793226875 |
| Kurtosis | -1.194441126 |
| Mean | 179235.1597 |
| Median Absolute Deviation (MAD) | 89187 |
| Skewness | 0.003038595775 |
| Sum | 3.847802486 × 10¹⁰ |
| Variance | 1.078170607 × 10¹⁰ |
| Monotonicity | Strictly increasing |
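Several of the descriptive statistics are derived from one another. The sketch below recomputes the coefficient of variation (standard deviation divided by mean) and the variance (standard deviation squared) from the reported figures; small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the descriptive statistics table.
mean = 179235.1597
std_dev = 103834.9944

coefficient_of_variation = std_dev / mean   # unitless ratio
variance = std_dev ** 2                     # squared standard deviation

print(f"CV       = {coefficient_of_variation:.10f}")
print(f"Variance = {variance:.9e}")
```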
| Value | Count | Frequency (%) |
| 6141 | 1 | < 0.1% |
| 351230 | 1 | < 0.1% |
| 97146 | 1 | < 0.1% |
| 91001 | 1 | < 0.1% |
| 93048 | 1 | < 0.1% |
| 72566 | 1 | < 0.1% |
| 78707 | 1 | < 0.1% |
| 119663 | 1 | < 0.1% |
| 117612 | 1 | < 0.1% |
| 129898 | 1 | < 0.1% |
| Other values (214669) | 214669 |
| Value | Count | Frequency (%) |
| 0 | 1 | |
| 1 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 7 | 1 | |
| 10 | 1 | |
| 11 | 1 | |
| 12 | 1 | |
| 13 | 1 | |
| 16 | 1 |
| Value | Count | Frequency (%) |
| 359391 | 1 | |
| 359382 | 1 | |
| 359378 | 1 | |
| 359377 | 1 | |
| 359376 | 1 | |
| 359375 | 1 | |
| 359374 | 1 | |
| 359373 | 1 | |
| 359372 | 1 | |
| 359371 | 1 |
Transaction ID
Real number (ℝ≥0)
| Distinct | 214679 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10220316.1 |
|---|---|
| Minimum | 10000384 |
| Maximum | 10439521 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 10000384 |
|---|---|
| 5-th percentile | 10021446.9 |
| Q1 | 10111083.5 |
| median | 10220504 |
| Q3 | 10329683.5 |
| 95-th percentile | 10418340.1 |
| Maximum | 10439521 |
| Range | 439137 |
| Interquartile range (IQR) | 218600 |
Descriptive statistics
| Standard deviation | 126960.4235 |
|---|---|
| Coefficient of variation (CV) | 0.01242235781 |
| Kurtosis | -1.192876614 |
| Mean | 10220316.1 |
| Median Absolute Deviation (MAD) | 109300 |
| Skewness | 0.003114962661 |
| Sum | 2.19408724 × 10¹² |
| Variance | 1.611894914 × 10¹⁰ |
| Monotonicity | Not monotonic |
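The monotonicity row describes the column in its stored row order. One way to classify a sequence is sketched below; only the labels "Strictly increasing" and "Not monotonic" appear in this report, so the other labels and the input lists are assumptions for illustration.

```python
from typing import Sequence


def monotonicity(values: Sequence[float]) -> str:
    """Classify a sequence by comparing each element with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"   # also returned for 0 or 1 elements
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"
    return "Not monotonic"


# Hypothetical inputs: an index-like column and shuffled identifiers.
print(monotonicity([0, 1, 4, 5, 7]))
print(monotonicity([10000390, 10000384, 10000388]))
```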
| Value | Count | Frequency (%) |
| 10386066 | 1 | < 0.1% |
| 10228297 | 1 | < 0.1% |
| 10236009 | 1 | < 0.1% |
| 10354766 | 1 | < 0.1% |
| 10294058 | 1 | < 0.1% |
| 10246938 | 1 | < 0.1% |
| 10228138 | 1 | < 0.1% |
| 10239439 | 1 | < 0.1% |
| 10294045 | 1 | < 0.1% |
| 10077640 | 1 | < 0.1% |
| Other values (214669) | 214669 |
| Value | Count | Frequency (%) |
| 10000384 | 1 | |
| 10000385 | 1 | |
| 10000386 | 1 | |
| 10000387 | 1 | |
| 10000388 | 1 | |
| 10000389 | 1 | |
| 10000390 | 1 | |
| 10000391 | 1 | |
| 10000392 | 1 | |
| 10000393 | 1 |
| Value | Count | Frequency (%) |
| 10439521 | 1 | |
| 10439514 | 1 | |
| 10439505 | 1 | |
| 10439504 | 1 | |
| 10439485 | 1 | |
| 10439451 | 1 | |
| 10439444 | 1 | |
| 10439437 | 1 | |
| 10439428 | 1 | |
| 10439427 | 1 |
Date of Travel
Date
| Distinct | 1095 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.6 MiB |
| Minimum | 2016-01-02 00:00:00 |
|---|---|
| Maximum | 2018-12-31 00:00:00 |
Company
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.7 MiB |
| Yellow Cab |
|---|
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
Characters and Unicode
| Total characters | 2146790 |
|---|---|
| Distinct characters | 9 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Yellow Cab |
|---|---|
| 2nd row | Yellow Cab |
| 3rd row | Yellow Cab |
| 4th row | Yellow Cab |
| 5th row | Yellow Cab |
| Value | Count | Frequency (%) |
| Yellow Cab | 214679 |
| Value | Count | Frequency (%) |
| yellow | 214679 | |
| cab | 214679 |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 429358 | |
| Y | 214679 | |
| e | 214679 | |
| o | 214679 | |
| w | 214679 | |
| 214679 | ||
| C | 214679 | |
| a | 214679 | |
| b | 214679 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1502753 | |
| Uppercase Letter | 429358 | 20.0% |
| Space Separator | 214679 | 10.0% |
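Because this column holds a single constant string, the character and category counts can be reproduced from that string alone. A minimal sketch using Python's standard `unicodedata` module; the two-letter codes it returns (`Lu`, `Ll`, `Zs`) are the Unicode general categories that the report labels Uppercase Letter, Lowercase Letter and Space Separator.

```python
import unicodedata
from collections import Counter

value = "Yellow Cab"    # the constant value of this column
n_rows = 214_679        # number of observations

# Count general categories within one value, then scale by the row count.
per_value = Counter(unicodedata.category(ch) for ch in value)
category_totals = {cat: n * n_rows for cat, n in per_value.items()}

total_characters = len(value) * n_rows
distinct_characters = len(set(value))

for cat, total in sorted(category_totals.items(), key=lambda kv: -kv[1]):
    print(f"{cat}: {total} ({100 * total / total_characters:.1f}%)")
```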
Most frequent character per category
| Value | Count | Frequency (%) |
| l | 429358 | |
| e | 214679 | |
| o | 214679 | |
| w | 214679 | |
| a | 214679 | |
| b | 214679 |
| Value | Count | Frequency (%) |
| Y | 214679 | |
| C | 214679 |
| Value | Count | Frequency (%) |
| 214679 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1932111 | |
| Common | 214679 | 10.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| l | 429358 | |
| Y | 214679 | |
| e | 214679 | |
| o | 214679 | |
| w | 214679 | |
| C | 214679 | |
| a | 214679 | |
| b | 214679 |
| Value | Count | Frequency (%) |
| 214679 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2146790 |
Most frequent character per block
| Value | Count | Frequency (%) |
| l | 429358 | |
| Y | 214679 | |
| e | 214679 | |
| o | 214679 | |
| w | 214679 | |
| 214679 | ||
| C | 214679 | |
| a | 214679 | |
| b | 214679 |
City
Categorical
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.9 MiB |
| NEW YORK NY | |
|---|---|
| CHICAGO IL | |
| LOS ANGELES CA | |
| BOSTON MA | |
| ATLANTA GA | 5795 |
| Other values (10) |
Length
| Max length | 14 |
|---|---|
| Median length | 11 |
| Mean length | 10.79549933 |
| Min length | 8 |
Characters and Unicode
| Total characters | 2317567 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | BOSTON MA |
|---|---|
| 2nd row | CHICAGO IL |
| 3rd row | NEW YORK NY |
| 4th row | LOS ANGELES CA |
| 5th row | CHICAGO IL |
| Value | Count | Frequency (%) |
| NEW YORK NY | 85918 | 40.0% |
| CHICAGO IL | 47264 | 22.0% |
| LOS ANGELES CA | 28168 | 13.1% |
| BOSTON MA | 24506 | 11.4% |
| ATLANTA GA | 5795 | 2.7% |
| DALLAS TX | 5637 | 2.6% |
| MIAMI FL | 4452 | 2.1% |
| AUSTIN TX | 3028 | 1.4% |
| ORANGE COUNTY | 2469 | 1.2% |
| DENVER CO | 2431 | 1.1% |
| Other values (5) | 5011 | 2.3% |
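Each frequency in these tables is the count divided by the 214679 observations. The sketch below reproduces the labels, including the "< 0.1%" form; that rule (a non-zero count whose percentage rounds to 0.0) is inferred from the rows of this report, not taken from the profiling library's source.

```python
def frequency_label(count: int, total: int) -> str:
    """Format count/total as a one-decimal percentage label."""
    rounded = f"{100 * count / total:.1f}"
    if count > 0 and rounded == "0.0":
        return "< 0.1%"     # non-zero but too small to show at one decimal
    return f"{rounded}%"


TOTAL = 214_679  # number of observations

for city, count in [("LOS ANGELES CA", 28168), ("BOSTON MA", 24506), ("ATLANTA GA", 5795)]:
    print(city, frequency_label(count, TOTAL))
```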
| Value | Count | Frequency (%) |
| ny | 85918 | |
| york | 85918 | |
| new | 85918 | |
| chicago | 47264 | |
| il | 47264 | |
| ca | 30179 | 5.5% |
| angeles | 28168 | 5.2% |
| los | 28168 | 5.2% |
| boston | 24506 | 4.5% |
| ma | 24506 | 4.5% |
| Other values (20) | 56613 |
Most occurring characters
| Value | Count | Frequency (%) |
| 329743 | ||
| N | 246251 | |
| O | 220942 | |
| A | 180564 | 7.8% |
| Y | 174305 | 7.5% |
| E | 153965 | 6.6% |
| C | 130640 | 5.6% |
| L | 127459 | 5.5% |
| I | 110438 | 4.8% |
| S | 93318 | 4.0% |
| Other values (15) | 549942 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 1987824 | |
| Space Separator | 329743 | 14.2% |
Most frequent character per category
| Value | Count | Frequency (%) |
| N | 246251 | |
| O | 220942 | |
| A | 180564 | 9.1% |
| Y | 174305 | 8.8% |
| E | 153965 | 7.7% |
| C | 130640 | 6.6% |
| L | 127459 | 6.4% |
| I | 110438 | 5.6% |
| S | 93318 | 4.7% |
| R | 92482 | 4.7% |
| Other values (14) | 457460 |
| Value | Count | Frequency (%) |
| 329743 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1987824 | |
| Common | 329743 | 14.2% |
Most frequent character per script
| Value | Count | Frequency (%) |
| N | 246251 | |
| O | 220942 | |
| A | 180564 | 9.1% |
| Y | 174305 | 8.8% |
| E | 153965 | 7.7% |
| C | 130640 | 6.6% |
| L | 127459 | 6.4% |
| I | 110438 | 5.6% |
| S | 93318 | 4.7% |
| R | 92482 | 4.7% |
| Other values (14) | 457460 |
| Value | Count | Frequency (%) |
| 329743 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2317567 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 329743 | ||
| N | 246251 | |
| O | 220942 | |
| A | 180564 | 7.8% |
| Y | 174305 | 7.5% |
| E | 153965 | 6.6% |
| C | 130640 | 5.6% |
| L | 127459 | 5.5% |
| I | 110438 | 4.8% |
| S | 93318 | 4.0% |
| Other values (15) | 549942 |
KM Travelled
Real number (ℝ≥0)
| Distinct | 874 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22.56776252 |
|---|---|
| Minimum | 1.9 |
| Maximum | 48 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 1.9 |
|---|---|
| 5-th percentile | 3.57 |
| Q1 | 12 |
| median | 22.47 |
| Q3 | 32.98 |
| 95-th percentile | 42 |
| Maximum | 48 |
| Range | 46.1 |
| Interquartile range (IQR) | 20.98 |
Descriptive statistics
| Standard deviation | 12.23534095 |
|---|---|
| Coefficient of variation (CV) | 0.542160125 |
| Kurtosis | -1.125975613 |
| Mean | 22.56776252 |
| Median Absolute Deviation (MAD) | 10.49 |
| Skewness | 0.05375046895 |
| Sum | 4844824.69 |
| Variance | 149.7035681 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 33.6 | 922 | 0.4% |
| 22.8 | 669 | 0.3% |
| 24 | 654 | 0.3% |
| 35.7 | 644 | 0.3% |
| 37.44 | 639 | 0.3% |
| 16.8 | 628 | 0.3% |
| 39.6 | 609 | 0.3% |
| 28.08 | 568 | 0.3% |
| 21.85 | 479 | 0.2% |
| 19.2 | 469 | 0.2% |
| Other values (864) | 208398 |
| Value | Count | Frequency (%) |
| 1.9 | 213 | |
| 1.92 | 226 | |
| 1.94 | 206 | |
| 1.96 | 219 | |
| 1.98 | 234 | |
| 2 | 224 | |
| 2.02 | 210 | |
| 2.04 | 223 | |
| 2.06 | 188 | |
| 2.08 | 225 |
| Value | Count | Frequency (%) |
| 48 | 221 | |
| 47.6 | 246 | |
| 47.2 | 233 | |
| 46.8 | 436 | |
| 46.41 | 240 | |
| 46.4 | 209 | |
| 46.02 | 244 | |
| 46 | 196 | |
| 45.63 | 197 | |
| 45.6 | 416 |
Price Charged
Real number (ℝ≥0)
| Distinct | 92003 |
|---|---|
| Distinct (%) | 42.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 476.4425666 |
|---|---|
| Minimum | 20.73 |
| Maximum | 2048.03 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 20.73 |
|---|---|
| 5-th percentile | 71.71 |
| Q1 | 234.815 |
| median | 439.86 |
| Q3 | 660.83 |
| 95-th percentile | 1040.581 |
| Maximum | 2048.03 |
| Range | 2027.3 |
| Interquartile range (IQR) | 426.015 |
Descriptive statistics
| Standard deviation | 300.6089791 |
|---|---|
| Coefficient of variation (CV) | 0.6309448403 |
| Kurtosis | 0.2016614755 |
| Mean | 476.4425666 |
| Median Absolute Deviation (MAD) | 212.1 |
| Skewness | 0.7192370246 |
| Sum | 102282213.8 |
| Variance | 90365.75832 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 538.44 | 12 | < 0.1% |
| 341.16 | 12 | < 0.1% |
| 85.97 | 11 | < 0.1% |
| 196.03 | 11 | < 0.1% |
| 577.16 | 11 | < 0.1% |
| 185.38 | 11 | < 0.1% |
| 475.02 | 11 | < 0.1% |
| 79.38 | 11 | < 0.1% |
| 625.58 | 11 | < 0.1% |
| 261.66 | 10 | < 0.1% |
| Other values (91993) | 214568 |
| Value | Count | Frequency (%) |
| 20.73 | 1 | |
| 22 | 1 | |
| 22.04 | 1 | |
| 22.11 | 1 | |
| 22.37 | 1 | |
| 22.42 | 1 | |
| 22.52 | 1 | |
| 22.59 | 1 | |
| 22.79 | 1 | |
| 22.81 | 1 |
| Value | Count | Frequency (%) |
| 2048.03 | 1 | |
| 2016.7 | 1 | |
| 2013.95 | 1 | |
| 1993.83 | 1 | |
| 1981.05 | 1 | |
| 1978.79 | 1 | |
| 1957.1 | 1 | |
| 1947.91 | 1 | |
| 1925.92 | 1 | |
| 1920.59 | 1 |
Cost of Trip
Real number (ℝ≥0)
| Distinct | 9808 |
|---|---|
| Distinct (%) | 4.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 297.9044013 |
|---|---|
| Minimum | 22.8 |
| Maximum | 691.2 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 22.8 |
|---|---|
| 5-th percentile | 47.6064 |
| Q1 | 158.4 |
| median | 295.608 |
| Q3 | 432.5616 |
| 95-th percentile | 558.144 |
| Maximum | 691.2 |
| Range | 668.4 |
| Interquartile range (IQR) | 274.1616 |
Descriptive statistics
| Standard deviation | 162.5627971 |
|---|---|
| Coefficient of variation (CV) | 0.5456877992 |
| Kurtosis | -1.076249653 |
| Mean | 297.9044013 |
| Median Absolute Deviation (MAD) | 137.0256 |
| Skewness | 0.08572898348 |
| Sum | 63953818.96 |
| Variance | 26426.663 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 479.808 | 139 | 0.1% |
| 471.744 | 129 | 0.1% |
| 362.88 | 118 | 0.1% |
| 488.376 | 115 | 0.1% |
| 370.656 | 112 | 0.1% |
| 403.2 | 108 | 0.1% |
| 423.36 | 108 | 0.1% |
| 435.456 | 106 | < 0.1% |
| 287.28 | 106 | < 0.1% |
| 241.92 | 105 | < 0.1% |
| Other values (9798) | 213533 |
| Value | Count | Frequency (%) |
| 22.8 | 12 | |
| 23.028 | 9 | |
| 23.04 | 10 | |
| 23.256 | 7 | |
| 23.2704 | 11 | |
| 23.28 | 5 | < 0.1% |
| 23.484 | 12 | |
| 23.5008 | 14 | |
| 23.5128 | 10 | |
| 23.52 | 9 |
| Value | Count | Frequency (%) |
| 691.2 | 7 | < 0.1% |
| 685.44 | 24 | |
| 679.728 | 9 | < 0.1% |
| 679.68 | 24 | |
| 674.016 | 28 | |
| 673.92 | 28 | |
| 668.352 | 12 | < 0.1% |
| 668.304 | 42 | |
| 668.16 | 18 | |
| 662.7348 | 15 | < 0.1% |
Customer ID
Real number (ℝ≥0)
| Distinct | 28458 |
|---|---|
| Distinct (%) | 13.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12832.31576 |
|---|---|
| Minimum | 1 |
| Maximum | 60000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 380 |
| Q1 | 1876 |
| median | 4372 |
| Q3 | 8966 |
| 95-th percentile | 58682 |
| Maximum | 60000 |
| Range | 59999 |
| Interquartile range (IQR) | 7090 |
Descriptive statistics
| Standard deviation | 18731.15845 |
|---|---|
| Coefficient of variation (CV) | 1.459686529 |
| Kurtosis | 1.403269138 |
| Mean | 12832.31576 |
| Median Absolute Deviation (MAD) | 3018 |
| Skewness | 1.715258081 |
| Sum | 2754828715 |
| Variance | 350856297 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1803 | 47 | < 0.1% |
| 1360 | 47 | < 0.1% |
| 494 | 47 | < 0.1% |
| 636 | 46 | < 0.1% |
| 903 | 45 | < 0.1% |
| 2766 | 45 | < 0.1% |
| 126 | 45 | < 0.1% |
| 2577 | 44 | < 0.1% |
| 992 | 44 | < 0.1% |
| 1070 | 44 | < 0.1% |
| Other values (28448) | 214225 |
| Value | Count | Frequency (%) |
| 1 | 25 | |
| 2 | 36 | |
| 3 | 40 | |
| 4 | 25 | |
| 5 | 23 | |
| 6 | 23 | |
| 7 | 34 | |
| 8 | 29 | |
| 9 | 35 | |
| 10 | 21 |
| Value | Count | Frequency (%) |
| 60000 | 14 | |
| 59999 | 6 | |
| 59998 | 6 | |
| 59997 | 8 | |
| 59996 | 4 | < 0.1% |
| 59995 | 11 | |
| 59994 | 10 | |
| 59993 | 12 | |
| 59992 | 8 | |
| 59991 | 7 |
Payment_Mode
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.5 MiB |
| Card | |
|---|---|
| Cash |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Characters and Unicode
| Total characters | 858716 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Card |
|---|---|
| 2nd row | Cash |
| 3rd row | Cash |
| 4th row | Cash |
| 5th row | Cash |
| Value | Count | Frequency (%) |
| Card | 128792 | 60.0% |
| Cash | 85887 | 40.0% |
| Value | Count | Frequency (%) |
| card | 128792 | |
| cash | 85887 |
Most occurring characters
| Value | Count | Frequency (%) |
| C | 214679 | |
| a | 214679 | |
| r | 128792 | |
| d | 128792 | |
| s | 85887 | |
| h | 85887 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 644037 | |
| Uppercase Letter | 214679 | 25.0% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 214679 | |
| r | 128792 | |
| d | 128792 | |
| s | 85887 | |
| h | 85887 |
| Value | Count | Frequency (%) |
| C | 214679 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 858716 |
Most frequent character per script
| Value | Count | Frequency (%) |
| C | 214679 | |
| a | 214679 | |
| r | 128792 | |
| d | 128792 | |
| s | 85887 | |
| h | 85887 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 858716 |
Most frequent character per block
| Value | Count | Frequency (%) |
| C | 214679 | |
| a | 214679 | |
| r | 128792 | |
| d | 128792 | |
| s | 85887 | |
| h | 85887 |
Gender
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.7 MiB |
| Male | |
|---|---|
| Female |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.829675935 |
| Min length | 4 |
Characters and Unicode
| Total characters | 1036830 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Male |
|---|---|
| 2nd row | Male |
| 3rd row | Male |
| 4th row | Male |
| 5th row | Male |
| Value | Count | Frequency (%) |
| Male | 125622 | 58.5% |
| Female | 89057 | 41.5% |
| Value | Count | Frequency (%) |
| male | 125622 | |
| female | 89057 |
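The length figures reported for this variable follow from the two value counts. A short worked check, with illustrative helper names; the median is computed as a count-weighted median of the string lengths.

```python
from typing import Dict, Optional

# Value counts copied from the table above.
value_counts = {"Male": 125_622, "Female": 89_057}

total_rows = sum(value_counts.values())
total_characters = sum(len(v) * n for v, n in value_counts.items())
mean_length = total_characters / total_rows


def weighted_median_length(counts: Dict[str, int]) -> Optional[int]:
    """Return the length at which the cumulative count reaches half the rows.

    For an exact tie at the halfway point this returns the lower median.
    """
    half = sum(counts.values()) / 2
    running = 0
    for length, n in sorted((len(v), n) for v, n in counts.items()):
        running += n
        if running >= half:
            return length
    return None  # only reached for an empty mapping


print(total_characters, f"{mean_length:.9f}", weighted_median_length(value_counts))
```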
Most occurring characters
| Value | Count | Frequency (%) |
| e | 303736 | |
| a | 214679 | |
| l | 214679 | |
| M | 125622 | |
| F | 89057 | 8.6% |
| m | 89057 | 8.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 822151 | |
| Uppercase Letter | 214679 | 20.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 303736 | |
| a | 214679 | |
| l | 214679 | |
| m | 89057 | 10.8% |
| Value | Count | Frequency (%) |
| M | 125622 | |
| F | 89057 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1036830 |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 303736 | |
| a | 214679 | |
| l | 214679 | |
| M | 125622 | |
| F | 89057 | 8.6% |
| m | 89057 | 8.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1036830 |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 303736 | |
| a | 214679 | |
| l | 214679 | |
| M | 125622 | |
| F | 89057 | 8.6% |
| m | 89057 | 8.6% |
Age
Real number (ℝ≥0)
| Distinct | 48 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 35.36739504 |
|---|---|
| Minimum | 18 |
| Maximum | 65 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 19 |
| Q1 | 25 |
| median | 33 |
| Q3 | 42 |
| 95-th percentile | 61 |
| Maximum | 65 |
| Range | 47 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 12.62347245 |
|---|---|
| Coefficient of variation (CV) | 0.3569240096 |
| Kurtosis | -0.4762890592 |
| Mean | 35.36739504 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.6796008014 |
| Sum | 7592637 |
| Variance | 159.3520567 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23 | 7580 | 3.5% |
| 20 | 7373 | 3.4% |
| 32 | 7256 | 3.4% |
| 27 | 7112 | 3.3% |
| 21 | 7088 | 3.3% |
| 25 | 7078 | 3.3% |
| 22 | 7070 | 3.3% |
| 39 | 6990 | 3.3% |
| 33 | 6988 | 3.3% |
| 19 | 6978 | 3.3% |
| Other values (38) | 143166 |
| Value | Count | Frequency (%) |
| 18 | 6258 | |
| 19 | 6978 | |
| 20 | 7373 | |
| 21 | 7088 | |
| 22 | 7070 | |
| 23 | 7580 | |
| 24 | 6644 | |
| 25 | 7078 | |
| 26 | 6844 | |
| 27 | 7112 |
| Value | Count | Frequency (%) |
| 65 | 1992 | |
| 64 | 2365 | |
| 63 | 2183 | |
| 62 | 2189 | |
| 61 | 2598 | |
| 60 | 2397 | |
| 59 | 2364 | |
| 58 | 2453 | |
| 57 | 2109 | |
| 56 | 2276 |
Income (USD/Month)
Real number (ℝ≥0)
| Distinct | 17824 |
|---|---|
| Distinct (%) | 8.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15099.1665 |
|---|---|
| Minimum | 2007 |
| Maximum | 34996 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 2007 |
|---|---|
| 5-th percentile | 3238 |
| Q1 | 8495 |
| median | 14737 |
| Q3 | 21049 |
| 95-th percentile | 29694 |
| Maximum | 34996 |
| Range | 32989 |
| Interquartile range (IQR) | 12554 |
Descriptive statistics
| Standard deviation | 7976.343733 |
|---|---|
| Coefficient of variation (CV) | 0.528263844 |
| Kurtosis | -0.6608618242 |
| Mean | 15099.1665 |
| Median Absolute Deviation (MAD) | 6282 |
| Skewness | 0.3022976191 |
| Sum | 3241473964 |
| Variance | 63622059.35 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8899 | 110 | 0.1% |
| 22525 | 109 | 0.1% |
| 9797 | 100 | < 0.1% |
| 16137 | 99 | < 0.1% |
| 17580 | 98 | < 0.1% |
| 13413 | 95 | < 0.1% |
| 19256 | 95 | < 0.1% |
| 16512 | 93 | < 0.1% |
| 9866 | 90 | < 0.1% |
| 20884 | 88 | < 0.1% |
| Other values (17814) | 213702 |
| Value | Count | Frequency (%) |
| 2007 | 16 | < 0.1% |
| 2009 | 1 | < 0.1% |
| 2010 | 1 | < 0.1% |
| 2011 | 1 | < 0.1% |
| 2012 | 67 | |
| 2013 | 3 | < 0.1% |
| 2015 | 2 | < 0.1% |
| 2017 | 2 | < 0.1% |
| 2019 | 17 | < 0.1% |
| 2020 | 5 | < 0.1% |
| Value | Count | Frequency (%) |
| 34996 | 2 | < 0.1% |
| 34995 | 2 | < 0.1% |
| 34989 | 23 | |
| 34985 | 14 | |
| 34984 | 9 | < 0.1% |
| 34983 | 2 | < 0.1% |
| 34973 | 1 | < 0.1% |
| 34972 | 7 | < 0.1% |
| 34968 | 11 | |
| 34967 | 13 |
Population
Real number (ℝ≥0)
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4152714.67 |
|---|---|
| Minimum | 248968 |
| Maximum | 8405837 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 248968 |
|---|---|
| 5-th percentile | 248968 |
| Q1 | 1595037 |
| median | 1955130 |
| Q3 | 8405837 |
| 95-th percentile | 8405837 |
| Maximum | 8405837 |
| Range | 8156869 |
| Interquartile range (IQR) | 6810800 |
Descriptive statistics
| Standard deviation | 3511699.739 |
|---|---|
| Coefficient of variation (CV) | 0.8456395438 |
| Kurtosis | -1.791341632 |
| Mean | 4152714.67 |
| Median Absolute Deviation (MAD) | 1706162 |
| Skewness | 0.3407323926 |
| Sum | 8.915006326 × 10¹¹ |
| Variance | 1.233203506 × 10¹³ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8405837 | 85918 | |
| 1955130 | 47264 | |
| 1595037 | 28168 | 13.1% |
| 248968 | 24506 | 11.4% |
| 814885 | 5795 | 2.7% |
| 942908 | 5637 | 2.6% |
| 1339155 | 4452 | 2.1% |
| 698371 | 3028 | 1.4% |
| 1030185 | 2469 | 1.2% |
| 754233 | 2431 | 1.1% |
| Other values (5) | 5011 | 2.3% |
| Value | Count | Frequency (%) |
| 248968 | 24506 | |
| 327225 | 1169 | 0.5% |
| 542085 | 631 | 0.3% |
| 545776 | 1033 | 0.5% |
| 698371 | 3028 | 1.4% |
| 754233 | 2431 | 1.1% |
| 814885 | 5795 | 2.7% |
| 942908 | 5637 | 2.6% |
| 943999 | 1200 | 0.6% |
| 959307 | 978 | 0.5% |
| Value | Count | Frequency (%) |
| 8405837 | 85918 | |
| 1955130 | 47264 | |
| 1595037 | 28168 | 13.1% |
| 1339155 | 4452 | 2.1% |
| 1030185 | 2469 | 1.2% |
| 959307 | 978 | 0.5% |
| 943999 | 1200 | 0.6% |
| 942908 | 5637 | 2.6% |
| 814885 | 5795 | 2.7% |
| 754233 | 2431 | 1.1% |
Users
Real number (ℝ≥0)
| Distinct | 15 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 187745.1734 |
|---|---|
| Minimum | 3643 |
| Maximum | 302149 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 3643 |
|---|---|
| 5-th percentile | 14978 |
| Q1 | 144132 |
| median | 164468 |
| Q3 | 302149 |
| 95-th percentile | 302149 |
| Maximum | 302149 |
| Range | 298506 |
| Interquartile range (IQR) | 158017 |
Descriptive statistics
| Standard deviation | 103764.7419 |
|---|---|
| Coefficient of variation (CV) | 0.5526892651 |
| Kurtosis | -1.312639547 |
| Mean | 187745.1734 |
| Median Absolute Deviation (MAD) | 137681 |
| Skewness | -0.187138635 |
| Sum | 4.030494609 × 10¹⁰ |
| Variance | 1.076712167 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 302149 | 85918 | |
| 164468 | 47264 | |
| 144132 | 28168 | 13.1% |
| 80021 | 24506 | 11.4% |
| 24701 | 5795 | 2.7% |
| 22157 | 5637 | 2.6% |
| 17675 | 4452 | 2.1% |
| 14978 | 3028 | 1.4% |
| 12994 | 2469 | 1.2% |
| 12421 | 2431 | 1.1% |
| Other values (5) | 5011 | 2.3% |
| Value | Count | Frequency (%) |
| 3643 | 631 | 0.3% |
| 6133 | 1200 | 0.6% |
| 7044 | 1033 | 0.5% |
| 9270 | 1169 | 0.5% |
| 12421 | 2431 | |
| 12994 | 2469 | |
| 14978 | 3028 | |
| 17675 | 4452 | |
| 22157 | 5637 | |
| 24701 | 5795 |
| Value | Count | Frequency (%) |
| 302149 | 85918 | |
| 164468 | 47264 | |
| 144132 | 28168 | 13.1% |
| 80021 | 24506 | 11.4% |
| 69995 | 978 | 0.5% |
| 24701 | 5795 | 2.7% |
| 22157 | 5637 | 2.6% |
| 17675 | 4452 | 2.1% |
| 14978 | 3028 | 1.4% |
| 12994 | 2469 | 1.2% |
Holiday
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 11.9 MiB |
| - | |
|---|---|
| Christmas Day | 735 |
| Thanksgiving Day | 624 |
| Veterans Day | 506 |
| Labor Day | 380 |
| Other values (6) | 1676 |
Length
| Max length | 37 |
|---|---|
| Median length | 1 |
| Mean length | 1.261278467 |
| Min length | 1 |
Characters and Unicode
| Total characters | 270770 |
|---|---|
| Distinct characters | 40 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | - |
|---|---|
| 2nd row | - |
| 3rd row | - |
| 4th row | - |
| 5th row | - |
| Value | Count | Frequency (%) |
| - | 210758 | |
| Christmas Day | 735 | 0.3% |
| Thanksgiving Day | 624 | 0.3% |
| Veterans Day | 506 | 0.2% |
| Labor Day | 380 | 0.2% |
| Columbus Day | 362 | 0.2% |
| Independence Day | 340 | 0.2% |
| Memorial Day | 335 | 0.2% |
| Presidents Day (Washingtons Birthday) | 251 | 0.1% |
| Martin Luther King Jr. Day | 231 | 0.1% |
| Value | Count | Frequency (%) |
| 210758 | ||
| day | 3921 | 1.8% |
| christmas | 735 | 0.3% |
| thanksgiving | 624 | 0.3% |
| veterans | 506 | 0.2% |
| labor | 380 | 0.2% |
| columbus | 362 | 0.2% |
| independence | 340 | 0.2% |
| memorial | 335 | 0.2% |
| washingtons | 251 | 0.1% |
| Other values (8) | 1740 | 0.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| - | 210758 | |
| a | 7391 | 2.7% |
| 5273 | 1.9% | |
| y | 4172 | 1.5% |
| n | 3989 | 1.5% |
| s | 3966 | 1.5% |
| D | 3921 | 1.4% |
| e | 3754 | 1.4% |
| i | 3533 | 1.3% |
| r | 3308 | 1.2% |
| Other values (30) | 20705 | 7.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Dash Punctuation | 210758 | |
| Lowercase Letter | 44812 | 16.5% |
| Uppercase Letter | 9194 | 3.4% |
| Space Separator | 5273 | 1.9% |
| Open Punctuation | 251 | 0.1% |
| Close Punctuation | 251 | 0.1% |
| Other Punctuation | 231 | 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 7391 | |
| y | 4172 | |
| n | 3989 | |
| s | 3966 | |
| e | 3754 | |
| i | 3533 | |
| r | 3308 | 7.4% |
| t | 2456 | 5.5% |
| h | 2092 | 4.7% |
| g | 1730 | 3.9% |
| Other values (11) | 8421 |
| Value | Count | Frequency (%) |
| D | 3921 | |
| C | 1097 | 11.9% |
| T | 624 | 6.8% |
| L | 611 | 6.6% |
| M | 566 | 6.2% |
| V | 506 | 5.5% |
| I | 340 | 3.7% |
| P | 251 | 2.7% |
| W | 251 | 2.7% |
| B | 251 | 2.7% |
| Other values (4) | 776 | 8.4% |
| Value | Count | Frequency (%) |
| - | 210758 |
| Value | Count | Frequency (%) |
| 5273 |
| Value | Count | Frequency (%) |
| . | 231 |
| Value | Count | Frequency (%) |
| ( | 251 |
| Value | Count | Frequency (%) |
| ) | 251 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 216764 | |
| Latin | 54006 | 19.9% |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 7391 | |
| y | 4172 | 7.7% |
| n | 3989 | 7.4% |
| s | 3966 | 7.3% |
| D | 3921 | 7.3% |
| e | 3754 | 7.0% |
| i | 3533 | 6.5% |
| r | 3308 | 6.1% |
| t | 2456 | 4.5% |
| h | 2092 | 3.9% |
| Other values (25) | 15424 |
| Value | Count | Frequency (%) |
| - | 210758 | |
| 5273 | 2.4% | |
| ( | 251 | 0.1% |
| ) | 251 | 0.1% |
| . | 231 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 270770 |
Most frequent character per block
| Value | Count | Frequency (%) |
| - | 210758 | |
| a | 7391 | 2.7% |
| 5273 | 1.9% | |
| y | 4172 | 1.5% |
| n | 3989 | 1.5% |
| s | 3966 | 1.5% |
| D | 3921 | 1.4% |
| e | 3754 | 1.4% |
| i | 3533 | 1.3% |
| r | 3308 | 1.2% |
| Other values (30) | 20705 | 7.6% |
Profit
Real number (ℝ)
| Distinct | 201424 |
|---|---|
| Distinct (%) | 93.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 178.5381653 |
|---|---|
| Minimum | -160.714 |
| Maximum | 1463.966 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | -160.714 |
|---|---|
| 5-th percentile | 0.72796 |
| Q1 | 42.159 |
| median | 117.5772 |
| Q3 | 263.699 |
| 95-th percentile | 556.4564 |
| Maximum | 1463.966 |
| Range | 1624.68 |
| Interquartile range (IQR) | 221.54 |
Descriptive statistics
| Standard deviation | 183.2411271 |
|---|---|
| Coefficient of variation (CV) | 1.026341492 |
| Kurtosis | 2.317422415 |
| Mean | 178.5381653 |
| Median Absolute Deviation (MAD) | 90.9692 |
| Skewness | 1.479546876 |
| Sum | 38328394.8 |
| Variance | 33577.31065 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 83.19 | 8 | < 0.1% |
| 4.75 | 7 | < 0.1% |
| 67.88 | 6 | < 0.1% |
| 15.18 | 6 | < 0.1% |
| 20.25 | 6 | < 0.1% |
| 31.75 | 6 | < 0.1% |
| 50.14 | 6 | < 0.1% |
| 14.31 | 6 | < 0.1% |
| 12.25 | 6 | < 0.1% |
| 113.63 | 6 | < 0.1% |
| Other values (201414) | 214616 |
| Value | Count | Frequency (%) |
| -160.714 | 1 | |
| -145.9468 | 1 | |
| -144.7664 | 1 | |
| -144.4464 | 1 | |
| -135.8752 | 1 | |
| -134.74 | 1 | |
| -134.204 | 1 | |
| -133.682 | 1 | |
| -133.672 | 1 | |
| -133.208 | 1 |
| Value | Count | Frequency (%) |
| 1463.966 | 1 | |
| 1445.272 | 1 | |
| 1433.342 | 1 | |
| 1424.1408 | 1 | |
| 1408.344 | 1 | |
| 1408.0252 | 1 | |
| 1399.11 | 1 | |
| 1390.4464 | 1 | |
| 1371.626 | 1 | |
| 1338.92 | 1 |
Year
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 12.9 MiB |
| 2017.0 | |
|---|---|
| 2018.0 | |
| 2016.0 |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Characters and Unicode
| Total characters | 1288074 |
|---|---|
| Distinct characters | 7 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2016.0 |
|---|---|
| 2nd row | 2016.0 |
| 3rd row | 2016.0 |
| 4th row | 2016.0 |
| 5th row | 2016.0 |
| Value | Count | Frequency (%) |
| 2017.0 | 77086 | 35.9% |
| 2018.0 | 73449 | 34.2% |
| 2016.0 | 64144 | 29.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 429358 | |
| 2 | 214679 | |
| 1 | 214679 | |
| . | 214679 | |
| 7 | 77086 | 6.0% |
| 8 | 73449 | 5.7% |
| 6 | 64144 | 5.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1073395 | |
| Other Punctuation | 214679 | 16.7% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 429358 | |
| 2 | 214679 | |
| 1 | 214679 | |
| 7 | 77086 | 7.2% |
| 8 | 73449 | 6.8% |
| 6 | 64144 | 6.0% |
| Value | Count | Frequency (%) |
| . | 214679 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1288074 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 429358 | |
| 2 | 214679 | |
| 1 | 214679 | |
| . | 214679 | |
| 7 | 77086 | 6.0% |
| 8 | 73449 | 5.7% |
| 6 | 64144 | 5.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1288074 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 429358 | |
| 2 | 214679 | |
| 1 | 214679 | |
| . | 214679 | |
| 7 | 77086 | 6.0% |
| 8 | 73449 | 5.7% |
| 6 | 64144 | 5.0% |
Month
Real number (ℝ≥0)
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.469282044 |
|---|---|
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.6 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 5 |
| median | 8 |
| Q3 | 11 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.470552254 |
|---|---|
| Coefficient of variation (CV) | 0.4646433531 |
| Kurtosis | -1.085846652 |
| Mean | 7.469282044 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.3797205204 |
| Sum | 1603498 |
| Variance | 12.04473295 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 12 | 28617 | 13.3% |
| 11 | 25343 | 11.8% |
| 10 | 24013 | 11.2% |
| 9 | 21442 | 10.0% |
| 8 | 18303 | 8.5% |
| 7 | 16188 | 7.5% |
| 6 | 14516 | 6.8% |
| 5 | 14406 | 6.7% |
| 1 | 13975 | 6.5% |
| 3 | 13264 | 6.2% |
| Other values (2) | 24612 | 11.5% |
| Value | Count | Frequency (%) |
| 1 | 13975 | |
| 2 | 11434 | |
| 3 | 13264 | |
| 4 | 13178 | |
| 5 | 14406 | |
| 6 | 14516 | |
| 7 | 16188 | |
| 8 | 18303 | |
| 9 | 21442 | |
| 10 | 24013 |
| Value | Count | Frequency (%) |
| 12 | 28617 | |
| 11 | 25343 | |
| 10 | 24013 | |
| 9 | 21442 | |
| 8 | 18303 | |
| 7 | 16188 | |
| 6 | 14516 | |
| 5 | 14406 | |
| 4 | 13178 | |
| 3 | 13264 |
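The Month summary statistics can be reproduced from the per-month counts alone. The sketch below assumes the variance is the sample variance (divisor n - 1), which is the pandas default; the counts are copied from the tables above and the variable names are illustrative.

```python
# Trips per month, copied from the Month value tables above.
month_counts = {
    1: 13975, 2: 11434, 3: 13264, 4: 13178, 5: 14406, 6: 14516,
    7: 16188, 8: 18303, 9: 21442, 10: 24013, 11: 25343, 12: 28617,
}

n = sum(month_counts.values())                      # number of observations
total = sum(m * c for m, c in month_counts.items())  # "Sum" in the report
mean = total / n

# Sample variance (assumed divisor n - 1) computed from the frequency table.
variance = sum(c * (m - mean) ** 2 for m, c in month_counts.items()) / (n - 1)
std = variance ** 0.5

print(n, total, round(mean, 6), round(variance, 6), round(std, 6))
```

Working from a frequency table like this is convenient when only the profiling report is at hand and the underlying DataFrame is not.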
Day of Week
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 13.1 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 6 |
| Mean length | 6.990525389 |
| Min length | 6 |
Characters and Unicode
| Total characters | 1500719 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 2 ? |
| Distinct scripts | 1 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Saturday |
|---|---|
| 2nd row | Saturday |
| 3rd row | Saturday |
| 4th row | Saturday |
| 5th row | Saturday |
| Value | Count | Frequency (%) |
| Friday | 48528 | 22.6% |
| Saturday | 46892 | 21.8% |
| Sunday | 42156 | 19.6% |
| Thursday | 23503 | 10.9% |
| Wednesday | 18031 | 8.4% |
| Monday | 17807 | 8.3% |
| Tuesday | 17762 | 8.3% |
| Value | Count | Frequency (%) |
| friday | 48528 | |
| saturday | 46892 | |
| sunday | 42156 | |
| thursday | 23503 | |
| wednesday | 18031 | 8.4% |
| monday | 17807 | 8.3% |
| tuesday | 17762 | 8.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 261571 | |
| d | 232710 | |
| y | 214679 | |
| u | 130313 | |
| r | 118923 | |
| S | 89048 | 5.9% |
| n | 77994 | 5.2% |
| s | 59296 | 4.0% |
| e | 53824 | 3.6% |
| F | 48528 | 3.2% |
| Other values (7) | 213833 |
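The length and character figures for this variable follow directly from the per-day counts. As a rough sketch (the dictionary and variable names are illustrative, not part of the report), the total characters, mean length, and per-character counts can be rebuilt like this:

```python
from collections import Counter

# Rows per day name, copied from the Day of Week value table above.
day_counts = {
    "Friday": 48528, "Saturday": 46892, "Sunday": 42156, "Thursday": 23503,
    "Wednesday": 18031, "Monday": 17807, "Tuesday": 17762,
}

n_rows = sum(day_counts.values())
total_chars = sum(len(day) * count for day, count in day_counts.items())
mean_length = total_chars / n_rows

# Each character in a day name occurs once per row carrying that name.
char_counts = Counter()
for day, count in day_counts.items():
    for ch in day:
        char_counts[ch] += count

print(total_chars, round(mean_length, 6), char_counts.most_common(3))
```

The number of keys in `char_counts` corresponds to the "Distinct characters" entry, since upper- and lowercase letters are counted separately.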
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 1286040 | |
| Uppercase Letter | 214679 | 14.3% |
Most frequent character per category
| Value | Count | Frequency (%) |
| a | 261571 | |
| d | 232710 | |
| y | 214679 | |
| u | 130313 | |
| r | 118923 | |
| n | 77994 | 6.1% |
| s | 59296 | 4.6% |
| e | 53824 | 4.2% |
| i | 48528 | 3.8% |
| t | 46892 | 3.6% |
| Other values (2) | 41310 | 3.2% |
| Value | Count | Frequency (%) |
| S | 89048 | |
| F | 48528 | |
| T | 41265 | |
| W | 18031 | 8.4% |
| M | 17807 | 8.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 1500719 |
Most frequent character per script
| Value | Count | Frequency (%) |
| a | 261571 | |
| d | 232710 | |
| y | 214679 | |
| u | 130313 | |
| r | 118923 | |
| S | 89048 | 5.9% |
| n | 77994 | 5.2% |
| s | 59296 | 4.0% |
| e | 53824 | 3.6% |
| F | 48528 | 3.2% |
| Other values (7) | 213833 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1500719 |
Most frequent character per block
| Value | Count | Frequency (%) |
| a | 261571 | |
| d | 232710 | |
| y | 214679 | |
| u | 130313 | |
| r | 118923 | |
| S | 89048 | 5.9% |
| n | 77994 | 5.2% |
| s | 59296 | 4.0% |
| e | 53824 | 3.6% |
| F | 48528 | 3.2% |
| Other values (7) | 213833 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
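As an illustration of that definition, here is a minimal pure-Python sketch of r as covariance divided by the product of the standard deviations. The sample data are the KM Travelled and Cost of Trip values from the ten first rows of this report; the function name `pearson_r` is an assumption made for this example, and the report itself presumably relies on pandas rather than on code like this.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)

# KM Travelled and Cost of Trip from the first ten rows of the dataset.
km_travelled = [15.15, 2.18, 34.56, 19.20, 13.92, 10.62, 30.00, 4.72, 12.98, 19.57]
cost_of_trip = [205.4340, 26.4216, 485.2224, 246.5280, 185.4144,
                138.9096, 403.2000, 62.8704, 165.1056, 248.9304]

r = pearson_r(km_travelled, cost_of_trip)

# Shifting and rescaling one variable (positive scale) leaves r unchanged.
r_rescaled = pearson_r([2 * v + 3 for v in km_travelled], cost_of_trip)
print(round(r, 4), round(r_rescaled, 4))
```

On this small sample r is close to 1, in line with the high-correlation warning the report raises for these two columns.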
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
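A small sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. Tied values receive their average rank here, which is a common convention but an assumption of this example; the helper names are illustrative.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r_linear = pearson_r(x, y)
print(round(rho, 4), round(r_linear, 4))
```

Because the cubic relationship is perfectly monotonic, ρ reaches 1 while Pearson's r stays below it.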
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
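The pair-counting definition translates almost directly into code. The sketch below implements that plain form (often called tau-a); library implementations such as the one in pandas typically use a tie-adjusted variant (tau-b), so results can differ when the data contain ties. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, ignoring tie corrections."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(round(tau, 4))
```

The quadratic loop over all pairs is fine for illustration but would be slow on a dataset of this size, where an O(n log n) implementation is preferable.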
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | Transaction ID | Date of Travel | Company | City | KM Travelled | Price Charged | Cost of Trip | Customer ID | Payment_Mode | Gender | Age | Income (USD/Month) | Population | Users | Holiday | Profit | Year | Month | Day of Week |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 10000429.0 | 2016-01-02 | Yellow Cab | BOSTON MA | 15.15 | 342.62 | 205.4340 | 57474.0 | Card | Male | 34.0 | 16558.0 | 248968.0 | 80021.0 | - | 137.1860 | 2016.0 | 1.0 | Saturday |
| 1 | 1 | 10000525.0 | 2016-01-02 | Yellow Cab | CHICAGO IL | 2.18 | 51.47 | 26.4216 | 4551.0 | Cash | Male | 19.0 | 6316.0 | 1955130.0 | 164468.0 | - | 25.0484 | 2016.0 | 1.0 | Saturday |
| 2 | 4 | 10000927.0 | 2016-01-02 | Yellow Cab | NEW YORK NY | 34.56 | 1121.11 | 485.2224 | 1808.0 | Cash | Male | 59.0 | 18999.0 | 8405837.0 | 302149.0 | - | 635.8876 | 2016.0 | 1.0 | Saturday |
| 3 | 5 | 10000721.0 | 2016-01-02 | Yellow Cab | LOS ANGELES CA | 19.20 | 529.23 | 246.5280 | 8117.0 | Cash | Male | 21.0 | 5946.0 | 1595037.0 | 144132.0 | - | 282.7020 | 2016.0 | 1.0 | Saturday |
| 4 | 7 | 10000519.0 | 2016-01-02 | Yellow Cab | CHICAGO IL | 13.92 | 327.23 | 185.4144 | 4429.0 | Cash | Male | 20.0 | 23387.0 | 1955130.0 | 164468.0 | - | 141.8156 | 2016.0 | 1.0 | Saturday |
| 5 | 10 | 10000806.0 | 2016-01-02 | Yellow Cab | NEW YORK NY | 10.62 | 347.48 | 138.9096 | 2153.0 | Card | Male | 18.0 | 8193.0 | 8405837.0 | 302149.0 | - | 208.5704 | 2016.0 | 1.0 | Saturday |
| 6 | 11 | 10001009.0 | 2016-01-02 | Yellow Cab | PHOENIX AZ | 30.00 | 1000.52 | 403.2000 | 21481.0 | Card | Male | 28.0 | 18030.0 | 943999.0 | 6133.0 | - | 597.3200 | 2016.0 | 1.0 | Saturday |
| 7 | 12 | 10000516.0 | 2016-01-02 | Yellow Cab | CHICAGO IL | 4.72 | 105.79 | 62.8704 | 5803.0 | Card | Male | 54.0 | 4964.0 | 1955130.0 | 164468.0 | - | 42.9196 | 2016.0 | 1.0 | Saturday |
| 8 | 13 | 10000663.0 | 2016-01-02 | Yellow Cab | DALLAS TX | 12.98 | 382.31 | 165.1056 | 26299.0 | Card | Male | 37.0 | 2729.0 | 942908.0 | 22157.0 | - | 217.2044 | 2016.0 | 1.0 | Saturday |
| 9 | 16 | 10000906.0 | 2016-01-02 | Yellow Cab | NEW YORK NY | 19.57 | 552.86 | 248.9304 | 2682.0 | Card | Male | 36.0 | 15280.0 | 8405837.0 | 302149.0 | - | 303.9296 | 2016.0 | 1.0 | Saturday |
Last rows
| | df_index | Transaction ID | Date of Travel | Company | City | KM Travelled | Price Charged | Cost of Trip | Customer ID | Payment_Mode | Gender | Age | Income (USD/Month) | Population | Users | Holiday | Profit | Year | Month | Day of Week |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 214669 | 359371 | 10435429.0 | 2018-12-31 | Yellow Cab | NEW YORK NY | 22.44 | 572.95 | 301.5936 | 1402.0 | Card | Female | 57.0 | 8870.0 | 8405837.0 | 302149.0 | - | 271.3564 | 2018.0 | 12.0 | Monday |
| 214670 | 359372 | 10437882.0 | 2018-12-31 | Yellow Cab | CHICAGO IL | 18.72 | 247.24 | 226.8864 | 5633.0 | Cash | Male | 49.0 | 23206.0 | 1955130.0 | 164468.0 | - | 20.3536 | 2018.0 | 12.0 | Monday |
| 214671 | 359373 | 10434558.0 | 2018-12-31 | Yellow Cab | CHICAGO IL | 35.34 | 471.81 | 496.1736 | 5085.0 | Card | Female | 33.0 | 10274.0 | 1955130.0 | 164468.0 | - | -24.3636 | 2018.0 | 12.0 | Monday |
| 214672 | 359374 | 10435235.0 | 2018-12-31 | Yellow Cab | NEW YORK NY | 3.80 | 82.52 | 53.3520 | 1762.0 | Card | Male | 62.0 | 4016.0 | 8405837.0 | 302149.0 | - | 29.1680 | 2018.0 | 12.0 | Monday |
| 214673 | 359375 | 10434288.0 | 2018-12-31 | Yellow Cab | BOSTON MA | 2.26 | 31.37 | 28.2048 | 59187.0 | Cash | Female | 52.0 | 13751.0 | 248968.0 | 80021.0 | - | 3.1652 | 2018.0 | 12.0 | Monday |
| 214674 | 359376 | 10434486.0 | 2018-12-31 | Yellow Cab | CHICAGO IL | 38.11 | 558.03 | 484.7592 | 5727.0 | Card | Female | 22.0 | 2106.0 | 1955130.0 | 164468.0 | - | 73.2708 | 2018.0 | 12.0 | Monday |
| 214675 | 359377 | 10435871.0 | 2018-12-31 | Yellow Cab | ORANGE COUNTY | 15.34 | 247.86 | 185.9208 | 16342.0 | Card | Female | 23.0 | 2677.0 | 1030185.0 | 12994.0 | - | 61.9392 | 2018.0 | 12.0 | Monday |
| 214676 | 359378 | 10438475.0 | 2018-12-31 | Yellow Cab | LOS ANGELES CA | 14.40 | 219.43 | 179.7120 | 7237.0 | Card | Female | 18.0 | 20571.0 | 1595037.0 | 144132.0 | - | 39.7180 | 2018.0 | 12.0 | Monday |
| 214677 | 359382 | 10438162.0 | 2018-12-31 | Yellow Cab | CHICAGO IL | 34.72 | 472.05 | 433.3056 | 4263.0 | Card | Male | 36.0 | 19488.0 | 1955130.0 | 164468.0 | - | 38.7444 | 2018.0 | 12.0 | Monday |
| 214678 | 359391 | 10434637.0 | 2018-12-31 | Yellow Cab | CHICAGO IL | 3.36 | 44.86 | 43.9488 | 5926.0 | Card | Male | 58.0 | 31841.0 | 1955130.0 | 164468.0 | - | 0.9112 | 2018.0 | 12.0 | Monday |
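The sample rows suggest that Profit, Year, Month and Day of Week are derived from the other columns: Profit matches Price Charged minus Cost of Trip, and the calendar fields match Date of Travel. The report does not state how they were computed, so the following is only a sketch under that assumption, using the standard library and a hypothetical helper name.

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def derive_columns(date_of_travel, price_charged, cost_of_trip):
    """Recompute the derived columns for one row (assumed derivation)."""
    d = date.fromisoformat(date_of_travel)
    return {
        "Profit": round(price_charged - cost_of_trip, 4),
        "Year": d.year,
        "Month": d.month,
        "Day of Week": DAY_NAMES[d.weekday()],  # weekday(): Monday is 0
    }

# First and last rows of the tables above.
first_row = derive_columns("2016-01-02", 342.62, 205.4340)
last_row = derive_columns("2018-12-31", 44.86, 43.9488)

# A row where the trip cost exceeds the price charged, giving a negative profit.
loss_row = derive_columns("2018-12-31", 471.81, 496.1736)
print(first_row, last_row, loss_row, sep="\n")
```

Rows like the last example account for the negative minimum seen in the Profit statistics.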